Using knowledge to organize sound: The prediction-driven approach to computational auditory scene analysis and its application to speech/nonspeech mixtures
Abstract
Computational auditory scene analysis – modeling the human ability to organize sound mixtures according to their sources – has experienced a rapid evolution as the simple principles suggested by psychological experiments have turned out to be less than the whole story. Phenomena such as the continuity illusion and phonemic restoration show that the brain is able to use a wide range of knowledge-based contextual constraints when interpreting obscured or complex mixtures. To model such processing, we need architectures that operate by confirming hypotheses about the observations rather than relying on directly extracted descriptions. One such architecture, the 'prediction-driven' approach, is presented along with results from its initial implementation. This architecture can be extended to take advantage of the high-level knowledge implicit in today's speech recognizers by modifying a recognizer to act as one of the 'component models' which provide the explanations of the signal mixture. Although this adaptation raises a number of issues, a preliminary investigation supports the argument that successful scene analysis must exploit such abstract knowledge at every level.
Similar resources
Computational Auditory Scene Analysis Exploiting Speech-recognition Knowledge
The field of computational auditory scene analysis (CASA) strives to build computer models of the human ability to interpret sound mixtures as the combination of distinct sources. A major obstacle to this enterprise is defining and incorporating the kind of high level knowledge of real-world signal structure exploited by listeners. Speech recognition, while typically ignoring the problem of non...
Prediction-driven Computational Auditory Scene Analysis for Dense Sound Mixtures
We interpret the sound reaching our ears as the combined effect of independent, sound-producing entities in the external world; hearing would have limited usefulness if it were defeated by overlapping sounds. Computer systems that are to interpret real-world sounds – for speech recognition or for multimedia indexing – must similarly interpret complex mixtures. However, existing functional models o...
Toward Automatic Sound Source Recognition: Identifying Musical Instruments
One of the broad goals of research in computational auditory scene analysis (CASA) is to create computer systems that can learn to recognize sound sources in a complex auditory environment. In this paper, a set of acoustic features is proposed that relate to the physical properties of sound-producing objects. In particular, a set of orchestral musical instrument sounds is presented as represent...
The auditory organization of speech and other sources in listeners and computational models
Speech is typically perceived against a background of other sounds. Listeners are adept at extracting target sources from the acoustic mixture reaching the ears. The auditory scene analysis account holds that this feat is the result of a two stage process. In the first stage, sound is decomposed both within and across auditory nuclei. Subsequent processes of perceptual organisation are informed...
Title: The auditory organization of speech and other sources in listeners and computational models
Speech is typically perceived against a background of other sounds. Listeners are adept at extracting target sources from the acoustic mixture reaching the ears. The auditory scene analysis account holds that this feat is the result of a two stage process: In the first stage sound is decomposed into collections of fragments in several dimensions. Subsequent processes of perceptual organization ...
Journal title:
- Speech Communication
Volume 27, Issue
Pages -
Publication date 1999